Journal of KIISE C: Computing Practices
Korean Title |
An Original-Text-Protecting Search Technique for Large-Volume Korean Documents |
English Title |
An Original Text Protecting Search Method for Huge Korean Documents |
Author |
박 선 영
김 성 환
조 환 규
Sun-Young Park
Sung-Hwan Kim
Hwan-Gue Cho
|
Citation |
Vol. 18, No. 7, pp. 563-567 (July 2012) |
Korean Abstract |
While similar-document search systems continue to be developed, collecting the data needed for similar-document search has become a major problem in relation to copyright. If a similar-document search system guarantees that it stores copyright holders' works in a transformed form from which the originals cannot be restored, the reluctance of copyright holders to provide their data could be eased. This paper proposes a system that can search for the presence of a specific word or sentence while protecting the original text, using a Korean skin extraction method based on initial consonants (choseong). The proposed system extracts the initial consonants of Korean documents and applies the Burrows-Wheeler Transformation to store the suffix array information and the original text information in minimal space. Experiments showed that fast and accurate search is possible for sentences of 20 or more characters. We also propose a search that tolerates mismatches of 1 to 2 characters and an 80% partial-match search, and confirm that they work effectively for queries of at least 5 and 15 characters, respectively. |
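The initial-consonant (skin) extraction described in the abstract can be illustrated with a minimal Python sketch. It assumes the input uses precomposed Hangul syllables (Unicode U+AC00 to U+D7A3) and, as a further assumption, simply drops every other character. The name `extract_skin` is hypothetical, and the extraction rules actually used in the paper are not given in this record.

```python
# The 19 initial consonants (choseong) in Unicode syllable order.
CHOSEONG = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"

HANGUL_BASE = 0xAC00          # first precomposed syllable, "가"
HANGUL_COUNT = 11172          # 19 initials * 21 medials * 28 finals
SYLLABLES_PER_INITIAL = 588   # 21 medials * 28 finals


def extract_skin(text: str) -> str:
    """Return the sequence of initial consonants of the Hangul syllables in text.

    Assumption for this sketch: non-Hangul characters (spaces, digits,
    Latin letters, punctuation) are dropped rather than preserved.
    """
    skin = []
    for ch in text:
        offset = ord(ch) - HANGUL_BASE
        if 0 <= offset < HANGUL_COUNT:
            skin.append(CHOSEONG[offset // SYLLABLES_PER_INITIAL])
    return "".join(skin)


if __name__ == "__main__":
    print(extract_skin("한글 문서"))
```

Because many different syllables share one initial consonant, the mapping is many-to-one, which is what makes the stored form hard to turn back into the original text.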
English Abstract |
While similar-document search systems have been developed steadily, collecting the data to be searched has become a problem under copyright law. If such a system guarantees that it stores documents in a converted form that is secure and cannot be recovered, obtaining consent from authors becomes much simpler. In this paper, based on the fact that the first phonemes of Korean sentences carry more information than the other phonemes, we propose a search method that protects the original text using Korean skin extraction. The proposed system extracts the first phonemes (the skin) from given Korean documents and stores them using the BWT (Burrows-Wheeler Transform) to minimize size while retaining the information of the original text and its suffix array. Experiments show that search is fast and accurate for queries longer than 20 characters. We also present methods for searching with 1 to 2 mismatches and for finding partial matches of 80%. We experimentally demonstrate that these methods work effectively for queries longer than 5 and 15 characters, respectively. |
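How a Burrows-Wheeler Transform supports exact substring search can be shown with a small Python sketch of standard backward search. This is not the authors' implementation: the names `build_bwt` and `count_matches` are hypothetical, the suffix array is built by naive sorting (fine for short strings only), and the input is assumed not to contain the `"\0"` sentinel.

```python
def build_bwt(text: str) -> str:
    """Return the BWT of text, using "\\0" as a sentinel that sorts first."""
    s = text + "\0"
    # Naive suffix array: sort suffix start positions by the suffix itself.
    suffix_array = sorted(range(len(s)), key=lambda i: s[i:])
    # The BWT character is the one preceding each sorted suffix.
    return "".join(s[i - 1] for i in suffix_array)


def count_matches(bwt: str, pattern: str) -> int:
    """Count occurrences of pattern in the original text by backward search."""
    if not pattern:
        return 0
    # counts[c]: total occurrences of c; first[c]: number of smaller characters.
    counts = {}
    for ch in bwt:
        counts[ch] = counts.get(ch, 0) + 1
    first, total = {}, 0
    for ch in sorted(counts):
        first[ch] = total
        total += counts[ch]
    # occ[c][i]: occurrences of c in bwt[:i].
    occ = {c: [0] * (len(bwt) + 1) for c in counts}
    for i, ch in enumerate(bwt):
        for c in occ:
            occ[c][i + 1] = occ[c][i] + (1 if c == ch else 0)
    # Narrow the suffix-array interval [lo, hi) from the last pattern character.
    lo, hi = 0, len(bwt)
    for c in reversed(pattern):
        if c not in first:
            return 0
        lo = first[c] + occ[c][lo]
        hi = first[c] + occ[c][hi]
        if lo >= hi:
            return 0
    return hi - lo


if __name__ == "__main__":
    bwt = build_bwt("abracadabra")
    print(count_matches(bwt, "abra"))
```

In a system like the one described, the text being transformed would be the extracted first-phoneme sequence rather than the document itself, and the query would be reduced to its first phonemes before searching.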
Keyword |
Similar document search
Burrows-Wheeler transformation
Initial-consonant (choseong) skin
Similar Document Searching
Burrows-Wheeler Transform
First Phoneme Skin
|